V2.8.0 #409
Conversation
Signed-off-by: Michał Górny <[email protected]>
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition. I do have some suggestions for making it better though... For recipe/meta.yaml:

This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/17876818413. Examine the logs at this URL for more detail.
Ok, so issues so far:
The release notes make it sound like we'll need to double-check nvtx support as well
From what I understand, this means checking reverse dependencies.
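For anyone wanting to do the same check locally, one rough way to enumerate reverse dependencies is mamba's repoquery; this is just a sketch of the idea (assuming a recent mamba and that querying the conda-forge channel directly works in your setup), not the exact procedure used here:

```
# list conda-forge packages that depend on pytorch, to see which of them
# might rely on nvtx support
mamba repoquery whoneeds -c conda-forge pytorch
```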
Wait, I'm reading the output wrong. Investigating further.
Okay, I've learned more about Windows shell than I wanted to know, and I suspect delayed expansion did not work as expected. I've tried replacing …
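For context, a minimal cmd.exe sketch of the pitfall (not the actual build script): inside a parenthesized block, %VAR% is substituted once when the whole block is parsed, while !VAR! is substituted when the line actually runs, and only if delayed expansion is enabled.

```bat
@echo off
setlocal enabledelayedexpansion
set "VALUE=before"
if "1"=="1" (
    set "VALUE=after"
    rem %VALUE% was expanded when the block was parsed, so this still prints "before"
    echo parse-time: %VALUE%
    rem !VALUE! is expanded at run time, so this prints "after"
    echo run-time:   !VALUE!
)
endlocal
```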
Haha, the win+CUDA build manages to blow through the disk space despite using the largest runner already.
But perhaps the "size" of that runner is only measured in CPUs/RAM, not storage? Could we extend that @aktech @wolfv?
Well, at least the Python version issue was fixed. Also, looks like my idea of using …
Le sigh, I've fetched the artifact and couldn't reproduce the segfaults locally. But I've noticed that our openblas+openmp constraint didn't work anymore. Let's try again. By the way, I'm wondering if we should perhaps skip CPU tests in CUDA builds. I see they're some of the longest tests in CI, and I suppose it's sufficient that we test them in CPU builds.
Probably related to #407, which unfortunately isn't green either.
Signed-off-by: Michał Górny <[email protected]>
Signed-off-by: Michał Górny <[email protected]>
Signed-off-by: Michał Górny <[email protected]>
Force-pushed from 8e57db5 to 9ce5214
Uh, so I guess CMake 4 breaks AArch64 builds? I'll try debugging that locally.
BTW: Should we do that?
I thought we had a workaround with a comment in our script for this.
linux-64 has a single test failure that looks like a minor tolerance violation.
Ah, sorry, indeed, I was doing a non-CUDA build and didn't notice it's there for CUDA.
https://docs.cirun.io/reference/yaml#custom-disk-size-for-azure

The Windows runners configuration needs to be updated with:

extra_config:
  storageProfile:
    osDisk:
      diskSizeGB: 512
…5.09.18.12.55.27
Other tools:
- conda-build 25.7.0
- rattler-build 0.47.0
- rattler-build-conda-compat 1.4.6
Signed-off-by: Michał Górny <[email protected]>
Signed-off-by: Michał Górny <[email protected]>
Thanks to @aktech for the suggestion.
Signed-off-by: Michał Górny <[email protected]>
Uh, I accidentally rerendered over the Windows fix 🤦. Let me read up on #413, in case I should change something before restarting.
Thanks a lot for the persistence on this one! ❤️
Discussion about the pybind situation in #413 is still ongoing with the pybind maintainers, so a rebuild for v3 (or removing the dependence on pybind-abi completely) can be done in a follow-up.
No problem. I'm sorry it took this long; I have made more mistakes than I should have, notably failed to pin …

There's also the open question on how to deal with cudnn. I'm not even sure if this is something to report to PyTorch or to NVIDIA.
@mgorny, I appreciate your effort here for us, the PyTorch on Windows users.
There's also the open question on how to deal with cudnn. I'm not even sure if this is something to report to PyTorch or to NVIDIA.
I think we could start with an issue on the cudnn feedstock, and at least write down the things you remember from that debugging session somewhere before it becomes just a haze. 😅
Windows CUDA builds are still failing to upload; I did it manually from the artefacts:

$ gh run download 17899219922 --repo conda-forge/pytorch-cpu-feedstock --name conda_artifacts_17899219922_win_64_channel_targetsconda-forge_maincu_hca575dce
$ unzip pytorch-cpu-feedstock_conda_artifacts_.zip
$ cd bld/win-64
$ rm current_repodata.json index.html repodata*
$ ls
libtorch-2.8.0-cuda128_mkl_ha34d6f4_300.conda       pytorch-2.8.0-cuda128_mkl_py312_h0850830_300.conda
pytorch-2.8.0-cuda128_mkl_py310_h0b8c608_300.conda  pytorch-2.8.0-cuda128_mkl_py313_hf206996_300.conda
pytorch-2.8.0-cuda128_mkl_py311_hd9a8a8a_300.conda  pytorch-gpu-2.8.0-cuda128_mkl_h2fd0c33_300.conda
$ ls | xargs anaconda upload
$ # the upload lands on the uploader's personal channel; copy the packages over to the conda-forge channel
$ DELEGATE=h-vetinari
PACKAGE_VERSION=2.8.0
for package in libtorch pytorch pytorch-gpu; do
  anaconda copy --from-label main --to-label main --to-owner conda-forge ${DELEGATE}/${package}/${PACKAGE_VERSION}
done
Presumably we'll need to restart that one failed AArch64 build, but I don't see the rerun button right now, so I guess it'll only appear when the other job is finished. Not sure if I can rerun it without rerunning the Windows build though.
Just don't use the run-wide restart. It's possible to restart a single job.
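If the web UI stays uncooperative, a single job can also be rerun from the CLI; a rough sketch (assuming a reasonably recent gh, with the job ID left as a placeholder):

```
# list the jobs in the run to find the failed job's numeric ID
gh run view 17899219922 --repo conda-forge/pytorch-cpu-feedstock --json jobs
# rerun just that job (and its dependents) instead of the whole run
gh run rerun 17899219922 --repo conda-forge/pytorch-cpu-feedstock --job <job-id>
```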
Hi all, I'm unsure if this is the right place to report this, but there's an issue with the file pytorch-2.8.0-cpu_mkl_py311_h98f00f5_100.conda in this release.
Checklist
- Reset the build number to 0 (if the version changed)
- Re-rendered with conda-smithy (use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)